Spectral Batch Normalization: Normalization in the Frequency Domain
Regularization is a set of techniques used to improve the generalization ability of deep neural networks. In this paper, we introduce spectral batch normalization (SBN), a novel and effective method that improves generalization by normalizing feature maps in the frequency (spectral) domain.
The activations of residual networks without batch normalization (BN) tend to explode exponentially with network depth at initialization. This leads
to extremely large feature map norms even though the parameters are relatively
small. These explosive dynamics can be very detrimental to learning. BN makes
weight decay regularization on the scaling factors
approximately equivalent to an additive penalty on the norm of the feature
maps, which, to a certain degree, prevents extremely large feature map norms.
However, we show experimentally that, despite the approximate additive penalty
of BN, feature maps in deep neural networks (DNNs) tend to explode at the
beginning of the network and that feature maps of DNNs contain large values
throughout training. This phenomenon also occurs, in a weakened form, in
non-residual networks. SBN addresses large feature maps by normalizing them in
the frequency domain. Our experiments show that SBN prevents exploding feature maps at initialization and large feature map values during training. Moreover, normalizing feature maps in the frequency domain leads to more uniformly distributed frequency components, which discourages DNNs from relying on individual frequency components of feature maps. These properties, together with other effects of SBN, regularize the training of residual and non-residual networks. We show experimentally that using SBN in
addition to standard regularization methods improves the performance of DNNs by
a relevant margin, e.g., ResNet50 on ImageNet by 0.71%.

Comment: Accepted by the International Joint Conference on Neural Networks (IJCNN) 202
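To make the idea concrete, here is a minimal PyTorch sketch of what a spectral batch normalization layer could look like. It is an illustration under stated assumptions, not the paper's exact formulation: it applies a 2D real FFT over the spatial dimensions, normalizes the amplitude of each frequency component with batch statistics while keeping the phase, and uses learnable per-channel, per-frequency affine parameters. The class name SpectralBatchNorm2d and all of these design choices are assumptions.

```python
import torch
import torch.nn as nn


class SpectralBatchNorm2d(nn.Module):
    """Sketch of a spectral batch normalization layer (assumed design)."""

    def __init__(self, num_features, height, width, eps=1e-5, momentum=0.1):
        super().__init__()
        w_freq = width // 2 + 1  # rfft2 keeps only the non-redundant half
        self.eps, self.momentum = eps, momentum
        # Learnable per-channel, per-frequency affine parameters (assumed shape).
        self.gamma = nn.Parameter(torch.ones(num_features, height, w_freq))
        self.beta = nn.Parameter(torch.zeros(num_features, height, w_freq))
        self.register_buffer("running_mean", torch.zeros(num_features, height, w_freq))
        self.register_buffer("running_var", torch.ones(num_features, height, w_freq))

    def forward(self, x):  # x: (N, C, H, W)
        f = torch.fft.rfft2(x, norm="ortho")   # to the frequency domain
        amp, phase = f.abs(), torch.angle(f)   # normalize amplitudes only (an assumption)
        if self.training:
            mean = amp.mean(dim=0)
            var = amp.var(dim=0, unbiased=False)
            with torch.no_grad():  # track running statistics for evaluation
                self.running_mean.lerp_(mean, self.momentum)
                self.running_var.lerp_(var, self.momentum)
        else:
            mean, var = self.running_mean, self.running_var
        amp = self.gamma * (amp - mean) / torch.sqrt(var + self.eps) + self.beta
        # Recombine amplitude and phase, then return to the spatial domain.
        return torch.fft.irfft2(torch.polar(amp, phase), s=x.shape[-2:], norm="ortho")
```

In this sketch the layer would drop in wherever BN normally sits, e.g. after a convolution.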
Weight Compander: A Simple Weight Reparameterization for Regularization
Regularization is a set of techniques used to improve the generalization ability of deep neural networks. In this paper, we introduce weight compander (WC), a novel and effective method that improves generalization by reparameterizing each weight in deep neural networks using a nonlinear function. It is a general, intuitive, cheap, and easy-to-implement method that can be combined with various other regularization techniques. Large weights in
deep neural networks are a sign of a more complex network that is overfitted to
the training data. Moreover, regularized networks tend to have a greater range
of weights around zero with fewer weights centered at zero. We introduce a
weight reparameterization function which is applied to each weight and
implicitly reduces overfitting by restricting the magnitude of the weights
while forcing them away from zero at the same time. This leads to more democratic decision-making in the network. Firstly, individual weights cannot
have too much influence in the prediction process due to the restriction of
their magnitude. Secondly, more weights are used in the prediction process,
since they are forced away from zero during training. This promotes the
extraction of more features from the input data and increases the level of
weight redundancy, which makes the network less sensitive to statistical
differences between training and test data. We extend our method to learn the
hyperparameters of the introduced weight reparameterization function. This
avoids hyperparameter search and gives the network the opportunity to align the
weight reparameterization with the training progress. We show experimentally
that using weight compander in addition to standard regularization methods
improves the performance of neural networks.

Comment: Accepted by the International Joint Conference on Neural Networks (IJCNN) 202
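As a concrete illustration of the reparameterization, here is a minimal PyTorch sketch. The compander function beta * tanh(alpha * w) is an illustrative stand-in, not the paper's actual function: near zero its slope alpha * beta is greater than one, pushing small weights away from zero, while its output is bounded by beta, restricting the weight magnitude. Making alpha and beta learnable mirrors the abstract's extension that learns the hyperparameters of the reparameterization function; the names weight_compander and CompandedLinear are hypothetical.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F


def weight_compander(w, alpha, beta):
    # Illustrative compander (an assumption, not the paper's exact function):
    # expands small weights away from zero (slope alpha * beta near the origin)
    # while compressing large ones (output bounded by beta).
    return beta * torch.tanh(alpha * w)


class CompandedLinear(nn.Module):
    """Linear layer whose weights are reparameterized through the compander."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.raw_weight = nn.Parameter(torch.empty(out_features, in_features))
        nn.init.kaiming_uniform_(self.raw_weight, a=math.sqrt(5))
        self.bias = nn.Parameter(torch.zeros(out_features))
        # Learnable compander hyperparameters (assumed initial values).
        self.alpha = nn.Parameter(torch.tensor(2.0))
        self.beta = nn.Parameter(torch.tensor(1.0))

    def forward(self, x):
        # The effective weight is computed on the fly from the raw weight.
        w = weight_compander(self.raw_weight, self.alpha, self.beta)
        return F.linear(x, w, self.bias)
```

Because the effective weight is a deterministic function of raw_weight, gradient descent on raw_weight implicitly trains the reparameterized weight, so the layer combines with other regularizers without changing the training loop.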